Principal Component Analysis for Sparse High-Dimensional Data
Abstract
Principal component analysis (PCA) is a widely used technique for data analysis and dimensionality reduction. Eigenvalue decomposition is the standard algorithm for solving PCA, but a number of other algorithms have been proposed. For instance, the EM algorithm is much more efficient when the dimensionality is high and the number of principal components is small. We study the case where the data are high-dimensional and a majority of the values are missing. In this case, both of these algorithms prove inadequate. We propose a gradient descent algorithm inspired by Oja's rule and speed it up with an approximate Newton's method. The computational complexity of the proposed method is linear in the number of observed values in the data and in the number of principal components. Experiments with Netflix data confirm that the proposed algorithm is about ten times faster than any of the four comparison methods.
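To make the complexity claim concrete, the following is a minimal sketch of plain gradient descent on the squared reconstruction error, computed over the observed entries only. It is not the authors' algorithm: it omits the Oja-inspired update details, the approximate Newton speed-up, and mean centering, and it recovers a subspace rather than orthogonal principal components. The function name `pca_observed_gd`, its parameters, and the default learning rate are assumptions made for illustration.

```python
import numpy as np

def pca_observed_gd(rows, cols, vals, shape, n_components=2,
                    lr=0.005, n_iters=300, seed=0):
    """Fit X ~ A @ S.T using only observed entries X[rows[k], cols[k]] = vals[k]."""
    rng = np.random.default_rng(seed)
    d, n = shape
    # Small random initial factors (hypothetical choice of scale).
    A = 0.1 * rng.standard_normal((d, n_components))
    S = 0.1 * rng.standard_normal((n, n_components))
    rmse_history = []
    for _ in range(n_iters):
        # Residuals on observed entries only: O(n_observed * n_components) work.
        err = vals - np.sum(A[rows] * S[cols], axis=1)
        rmse_history.append(float(np.sqrt(np.mean(err ** 2))))
        # Negative gradients of 0.5 * sum(err**2), accumulated per row and column.
        step_A = np.zeros_like(A)
        step_S = np.zeros_like(S)
        np.add.at(step_A, rows, err[:, None] * S[cols])
        np.add.at(step_S, cols, err[:, None] * A[rows])
        # Both steps were computed from the old factors, so update them together.
        A += lr * step_A
        S += lr * step_S
    return A, S, rmse_history
```

Each iteration touches every observed value once per component and never forms the full data matrix, which is the property the abstract describes as cost linear in the number of observed values and in the number of components. The fixed step size is the weakest part of this sketch and would likely need tuning on real data.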
Similar resources
Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains
In scientific and commercial fields associated with modern agriculture, categorizing rice types and determining their quality is very important. Various image processing algorithms have been applied in recent years to detect different agricultural products. In this paper, the problem of rice classification and quality detection is presented based on model learning concepts includ...
Combined Unfolded Principal Component Analysis and Artificial Neural Network for Determination of Ibuprofen in Human Serum by Three-Dimensional Excitation–Emission Matrix Fluorescence Spectroscopy
This study describes a simple and rapid approach to monitoring ibuprofen (IBP). Unfolded principal component analysis-artificial neural network (UPCA-ANN) and excitation-emission spectra obtained by spectrofluorimetry were combined to develop a new model for the determination of IBP in human serum samples. Fluorescence landscapes with excitation wavelengths from 235 to 265 nm and emission...
Methods for regression analysis in high-dimensional data
With advances in science, knowledge, and technology, new and precise methods for measuring, collecting, and recording information have been developed, which has led to the appearance and growth of high-dimensional data. A high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...
Hyperspectral Image Classification Based on the Fusion of the Features Generated by Sparse Representation Methods, Linear and Non-linear Transformations
The ability to record the high-resolution spectral signature of the earth's surface is arguably the most important feature of hyperspectral sensors. Classification of hyperspectral imagery is known as one of the methods for extracting information from these remote sensing data sources. Despite the high potential of hyperspectral images in terms of information content, there...
A New IRIS Segmentation Method Based on Sparse Representation
Iris recognition is one of the most reliable methods for identification. In general, it consists of image acquisition, iris segmentation, feature extraction, and matching. Among them, iris segmentation plays an important role in the performance of any iris recognition system. Nonlinear eye movement, occlusion, and specular reflection are the main challenges for any iris segmentation method. In thi...
Publication date: 2007